Scalability Tuning of the Load Balancing and Coupling Framework FD4
Abstract
In this paper, we discuss the scalability tuning of the HPC software framework FD4, which provides dynamic load balancing and model coupling for multiphase and multiphysics simulations. We first investigate scalability bottlenecks using the Vampir performance analysis toolset on the BlueGene/Q system at JSC. We then describe and evaluate our optimized algorithms: a new hierarchical 1D partitioning algorithm for dynamic load balancing based on space-filling curves (SFC), and a new method to organize the coupling metadata. The final scalability benchmark shows that the overhead of FD4 has been reduced by a factor of 3.3 at 262 144 ranks, which considerably increases the scalability of the whole application.
Similar resources
Highly Scalable Dynamic Load Balancing in the Atmospheric Modeling System COSMO-SPECS+FD4
To study the complex interactions between cloud processes and the atmosphere, several atmospheric models have been coupled with detailed spectral cloud microphysics schemes. These schemes are computationally expensive, which limits their practical application. Additionally, our performance analysis of the model system COSMO-SPECS (atmospheric model of the Consortium for Small-scale Modeling cou...
FD4: A Framework for Highly Scalable Load Balancing and Coupling of Multiphase Models
Increasingly detailed simulation codes are being developed, driven by the growing capability of high performance computers and the increasing knowledge about the underlying processes. This includes multiphase and multiphysics simulations as well as the coupling of multidisciplinary models. This paper introduces the framework FD4 (Four-Dimensional Distributed Dynamic Data structures), which enables ...
Visualizing Process Composition and Load Balance in Parallel Coupled Models
Coupled model development presents a set of challenges broadly called the coupling problem; message-passing parallelism complicates matters, resulting in the parallel coupling problem. Performance tuning of parallel coupled systems is complex and performed largely in an ad hoc fashion; from the domain scientist’s perspective the figure of merit is throughput, which is the amount of simulation a...
Highly scalable SFC-based dynamic load balancing and its application to atmospheric modeling
Load balance is one of the major challenges for efficient supercomputing, especially for applications that exhibit workload variations. Various dynamic load balancing and workload partitioning methods have been developed to handle this issue by migrating workload between nodes periodically during the runtime. However, on today’s top HPC systems – and even more so on future exascale systems – ru...
Efficient Support for Matrix Computations on Heterogeneous Multi-core and Multi-GPU Architectures
We present a new methodology for utilizing all CPU cores and all GPUs on a heterogeneous multicore and multi-GPU system to support matrix computations efficiently. Our approach is able to achieve four objectives: a high degree of parallelism, minimized synchronization, minimized communication, and load balancing. Our main idea is to treat the heterogeneous system as a distributed-memory machine...
Publication date: 2013